Multi-agent artificial intelligence research promises a path to develop intelligent technologies that are more human-like and more human-compatible than those produced by "solipsistic" approaches, which do not consider interactions between agents. Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios. Each scenario pairs a physical environment (a "substrate") with a reference set of co-players (a "background population"), to create a social situation with substantial interdependence between the individuals involved. For instance, some scenarios were inspired by institutional-economics-based accounts of natural resource management and public-good-provision dilemmas. Others were inspired by considerations from evolutionary biology, game theory, and artificial life. Melting Pot aims to cover a maximally diverse set of interdependencies and incentives. It includes the commonly studied extreme cases of perfectly competitive (zero-sum) motivations and perfectly cooperative (shared-reward) motivations, but does not stop with them. As in real life, a clear majority of scenarios in Melting Pot have mixed incentives. They are neither purely competitive nor purely cooperative and thus demand that successful agents be able to navigate the resulting ambiguity. Here we describe Melting Pot 2.0, which revises and expands on Melting Pot. We also introduce support for scenarios with asymmetric roles, and explain how to integrate them into the evaluation protocol. This report also contains: (1) details of all substrates and scenarios; (2) a complete description of all baseline algorithms and results. Our intention is for it to serve as a reference for researchers using Melting Pot 2.0.
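We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless than prompted language model baselines. We train our models using reinforcement learning from human feedback, with two additions that help human raters judge agent behaviour. First, to make our agent more helpful and harmless, we break down the requirements for good dialogue into natural language rules the agent should follow, and ask raters about each rule separately. We demonstrate that this breakdown enables us to collect more targeted human judgements of agent behaviour and allows for more efficient rule-conditional reward models. Second, our agent provides evidence from sources supporting factual claims when preference judgements over model statements are collected. For factual questions, the evidence provided by Sparrow supports the response 78% of the time. Sparrow is preferred more often than the baselines while being more resilient to adversarial probing by humans, violating our rules only 8% of the time when probed. Finally, we conduct extensive analyses showing that although our model learns to follow our rules, it can exhibit distributional biases.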
Tumor segmentation in histopathology images is often complicated by its composition of different histological subtypes and class imbalance. Oversampling subtypes with low prevalence features is not a satisfactory solution since it eventually leads to overfitting. We propose to create synthetic images with semantically-conditioned deep generative networks and to combine subtype-balanced synthetic images with the original dataset to achieve better segmentation performance. We show the suitability of Generative Adversarial Networks (GANs) and especially diffusion models to create realistic images based on subtype-conditioning for the use case of HER2-stained histopathology. Additionally, we show the capability of diffusion models to conditionally inpaint HER2 tumor areas with modified subtypes. Combining the original dataset with the same amount of diffusion-generated images increased the tumor Dice score from 0.833 to 0.854 and almost halved the variance between the HER2 subtype recalls. These results create the basis for more reliable automatic HER2 analysis with lower performance variance between individual HER2 subtypes.
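The two quantities reported above, the Dice score and the variance between subtype recalls, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the flat 0/1 mask representation, and the convention for two empty masks are assumptions made for illustration only.

```python
from statistics import pvariance


def dice_score(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks.

    `pred` and `target` are equal-length sequences of 0/1 values
    (an assumed representation; real masks would be 2-D images).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


def recall_variance(recalls_by_subtype):
    """Population variance of per-subtype recall values.

    `recalls_by_subtype` maps a (hypothetical) subtype name to its recall.
    """
    return pvariance(recalls_by_subtype.values())
```

For example, `dice_score([1, 1, 0, 0], [1, 0, 0, 0])` overlaps in one pixel out of three marked pixels in total, giving 2/3. A lower `recall_variance` means the segmentation performs more evenly across subtypes.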
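Uncertainty quantification in automated image analysis is highly desired in many applications. Typically, machine learning models for classification or segmentation are developed only to provide binary answers. However, quantifying the uncertainty of the models can play a critical role in active learning or human-machine interaction. Uncertainty quantification is especially difficult when using deep learning-based models, which are the state of the art in many imaging applications. Current uncertainty quantification approaches do not scale well to high-dimensional real-world problems. Scalable solutions often rely on classical techniques applied during inference, or on training ensembles of identical models with different random seeds, to obtain a posterior distribution. In this paper, we show that these approaches fail to approximate the classification probability. Instead, we propose a scalable and intuitive framework to calibrate ensembles of deep learning models so that they produce uncertainty quantification measurements that approximate the classification probability. On unseen test data, we demonstrate improved calibration, sensitivity (in two out of three cases), and precision compared with the standard approaches. We further motivate the use of our method in active learning, in creating pseudo-labels to learn from unlabeled images, and in human-machine collaboration.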
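To make the idea of an ensemble output "approximating the classification probability" concrete, the sketch below averages the predictions of several ensemble members and measures how far the averaged probabilities are from observed frequencies. It uses the expected calibration error (ECE), a commonly used metric chosen here for illustration. It is not the calibration framework proposed in the paper, and all function names and the binary setting are assumptions.

```python
def ensemble_mean(member_probs):
    """Average P(class = 1) over ensemble members, sample by sample.

    `member_probs` is a list with one entry per ensemble member; each
    entry is a list of predicted probabilities, one per sample.
    """
    n_members = len(member_probs)
    return [sum(column) / n_members for column in zip(*member_probs)]


def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: bin samples by predicted probability, then take the
    sample-weighted mean of |mean predicted probability - fraction of
    positive labels| over the non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to last bin
        bins[index].append((p, y))

    n_samples = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_positive = sum(y for _, y in members) / len(members)
        ece += len(members) / n_samples * abs(mean_prob - frac_positive)
    return ece
```

An ECE of 0 means that, within every bin, the predicted probability matches the observed fraction of positives. An ensemble that predicts 0.9 for samples that are positive only half the time has a gap of 0.4 in that bin.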
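During the COVID-19 pandemic, the authors of this paper organized a Reinforcement Learning (RL) course for a graduate school in the field of data science. We describe the strategy and materials used to create an exciting learning experience despite the ubiquitous Zoom fatigue, and we evaluate the course qualitatively. The key organizational features are a focus on a competitive hands-on setting in teams, supported by a minimum of lectures providing the essential background on RL. The practical part of the course revolved around Hearts Gym, an RL environment that we developed as an entry-level tutorial for RL. Participants were tasked with training agents in order to explore reward shaping and other RL hyperparameters. For the final evaluation, the participants' agents competed against each other.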
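Vast oil and gas transmission pipelines require periodic monitoring for maintenance and hazard inspection to avoid equipment failure and potential accidents. The severe COVID-19 pandemic situation has forced companies to shrink the size of their teams. One risk faced on site is the uncontrolled release of oil and gas. Among the many inspection methods, unmanned aerial vehicle systems offer flexibility and stability. Unmanned aerial vehicles can transfer data in real time while carrying out their monitoring tasks. This paper focuses on unmanned aerial vehicles equipped with optical sensing and artificial intelligence, especially image recognition with deep learning techniques, for pipeline monitoring. Unmanned aerial vehicles can be used for regular patrol duties to identify and capture images and videos of areas of interest. Hard-to-reach places can be accessed faster, more cheaply, and with less risk. The paper is based on the idea of capturing videos and images during drone-based inspections, which may reveal several potential problems before they become dangerous. Damage can appear as a weakening of the cladding on the external pipe insulation. External corrosion may also reduce the pipe wall thickness. The paper presents a survey completed by experts from the oil and gas industry, carried out to determine the functional and non-functional requirements of the proposed system.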